Enhancing a Dictionary for Transfer Rule Acquisition
Abstract
The JMdict/EDICT Japanese-English Dictionary is a freely available dictionary distributed in XML (JMdict) and text (EDICT) formats. It is widely used as a source of lexical material in dictionary systems and text-processing projects. We propose two refinements to make the dictionary more computationally tractable: marking entries where the English is not a translation equivalent, and expanding contracted entries. We then propose and apply semi-automatic methods to refine existing entries. The resulting dictionary is shown to be more suitable for the construction of machine translation rules.
Similar resources
Combining Resources for Open Source Machine Translation
In this paper, we present a Japanese→English machine translation system that combines rule-based and statistical translation. Our system is unique in that all of its components are freely available as open source software. We describe the development of the rule-based translation engine including transfer rule acquisition from an open bilingual dictionary. We also show how translations from bot...
A System for Syntactic Structure Transfer from Malayalam to English
This paper describes the design and development of a system for syntactic structure transfer of Malayalam sentences to English. A syntactic structure transfer module is required in machine translation systems using a transfer based approach. The system uses a rule based approach. It makes use of rules of morphology of both Malayalam and English and syntactic structure transfer rules between Mal...
A New IRIS Segmentation Method Based on Sparse Representation
Iris recognition is one of the most reliable methods for identification. In general, it consists of image acquisition, iris segmentation, feature extraction and matching. Among them, iris segmentation plays an important role in the performance of any iris recognition system. Nonlinear eye movement, occlusion, and specular reflection are the main challenges for any iris segmentation method. In thi...
Enhancing Morphological Analyzers by Unknown Word Decomposition
This paper describes an approach to integrating the decomposition of non-lexicalized word compounds and derivations into the morphological analyzers of a company's NLP product line. The component employs word formation rules and filtering techniques to decompose words that are not contained in the underlying dictionary database, thereby increasing the average word recognition rate of the mo...